I am starting to investigate what is going on at small spatial scales in my co-evolution simulation. I decided to collect newt and snake information and see if they were correlated across n-by-n grids. I collected correlation data from the GA experiments from the 5,000, 20,000, and 100,000 generation simulations. (I also tested the correlations with and without perUnitArea=T. It turns out that when you don’t have perUnitArea=T, the edge of the simulation has a greater effect on the calculations, turning the correlations positive.) I will also examine how the correlation values change when the grid has more boxes (a smaller area to compare). Specifically, I will test 5 by 5, 8 by 8, 10 by 10, and 20 by 20 grids.
My simulation has newts and snakes coevolving in an arms race! The newts are toxic and interact with resistant snakes. Snakes attack and eat the newts, and in order to eat more newts they become more resistant. However, if a snake eats a newt that is more toxic than the snake can resist, the snake dies. As snakes become more resistant, the less toxic newts are devoured, leaving only the most toxic newts. These toxic newts reproduce and create even more toxic newts. The cycle repeats continuously. My simulation is run in a two-dimensional space with an x and a y axis.
GA1 experiment values:
GA2 experiment values:
collected info:
Tested Grids
Something I noticed when playing around with different variables in my simulation was that changing the interaction rate had a large effect, so I will also look at that here. Special note: when the interaction rate is increased, newts and snakes may avoid each other.
Before running this experiment, I thought about the possible outcomes. I am looking at relationships between species phenotypes, population size, and resistance and toxicity across regions of different sizes. I predicted that the number of newts and the amount of toxicity, and likewise the number of snakes and the amount of resistance, would be highly correlated (green line). I also predicted that when newts and snakes are co-evolving, their phenotypes would be positively correlated (blue line). I think that the number of newts will be negatively correlated with the population size of snakes and vice versa (purple line). For example, if there are more snakes in a specific region (and the number of snakes is increasing), then there would be fewer newts (and the number of newts would be decreasing). I also predict that the amount of toxin in a region will be negatively correlated with the population size of snakes, and that resistance will be negatively correlated with the population size of newts (red line). For example, if the newts in a specific region become more toxic (increasing the total amount of toxicity), the number of snakes will decrease (as they try to eat a more toxic newt, they die). Below is a graphical representation of my predictions.
After plotting the results, the correlation between the number of newts and the amount of toxin, and between the number of snakes and the amount of resistance, was very close to 1. It is not plotted in my main results.
First, I process the data that I collected from the cost 1on1 slim simulations. There are 16 different simulations in the GA1 experiment and 25 different simulations in the GA2 experiment (I will refer to these simulations as groups). Here I will output each group's simulation letter along with the mutation rate and effect size of both newts and snakes. I collected data from 8 different simulation groups, testing what happens as the length of the simulation increases and how the size of the measured area affects the correlation calculations.
## [1] "Simulation A: Snake mu-rate & effect sd (1.0e-08, 0.005) Newt mu-rate & effect sd (1.0e-08, 0.005)"
## [1] "Simulation B: Snake mu-rate & effect sd (1.0e-08, 0.005) Newt mu-rate & effect sd (1.0e-09, 0.05)"
## [1] "Simulation C: Snake mu-rate & effect sd (1.0e-08, 0.005) Newt mu-rate & effect sd (1.0e-10, 0.5)"
## [1] "Simulation D: Snake mu-rate & effect sd (1.0e-08, 0.005) Newt mu-rate & effect sd (1.0e-11, 5.0)"
## [1] "Simulation E: Snake mu-rate & effect sd (1.0e-09, 0.05) Newt mu-rate & effect sd (1.0e-08, 0.005)"
## [1] "Simulation F: Snake mu-rate & effect sd (1.0e-09, 0.05) Newt mu-rate & effect sd (1.0e-09, 0.05)"
## [1] "Simulation G: Snake mu-rate & effect sd (1.0e-09, 0.05) Newt mu-rate & effect sd (1.0e-10, 0.5)"
## [1] "Simulation H: Snake mu-rate & effect sd (1.0e-09, 0.05) Newt mu-rate & effect sd (1.0e-11, 5.0)"
## [1] "Simulation I: Snake mu-rate & effect sd (1.0e-10, 0.5) Newt mu-rate & effect sd (1.0e-08, 0.005)"
## [1] "Simulation J: Snake mu-rate & effect sd (1.0e-10, 0.5) Newt mu-rate & effect sd (1.0e-09, 0.05)"
## [1] "Simulation K: Snake mu-rate & effect sd (1.0e-10, 0.5) Newt mu-rate & effect sd (1.0e-10, 0.5)"
## [1] "Simulation L: Snake mu-rate & effect sd (1.0e-10, 0.5) Newt mu-rate & effect sd (1.0e-11, 5.0)"
## [1] "Simulation M: Snake mu-rate & effect sd (1.0e-11, 5.0) Newt mu-rate & effect sd (1.0e-08, 0.005)"
## [1] "Simulation N: Snake mu-rate & effect sd (1.0e-11, 5.0) Newt mu-rate & effect sd (1.0e-09, 0.05)"
## [1] "Simulation O: Snake mu-rate & effect sd (1.0e-11, 5.0) Newt mu-rate & effect sd (1.0e-10, 0.5)"
## [1] "Simulation P: Snake mu-rate & effect sd (1.0e-11, 5.0) Newt mu-rate & effect sd (1.0e-11, 5.0)"
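The printed list follows a regular pattern: the same four (mutation rate, effect sd) pairs are crossed for snakes and newts, with the snake pair changing slowest. A minimal Python sketch of that labeling is below. It is not the code that produced the output above; the names `GA1_PAIRS` and `label_groups` are made up for illustration.

```python
from itertools import product
from string import ascii_uppercase

# (mutation rate, effect-size sd) pairs as they appear in the printed GA1 list
GA1_PAIRS = [(1.0e-08, 0.005), (1.0e-09, 0.05), (1.0e-10, 0.5), (1.0e-11, 5.0)]


def label_groups(pairs):
    """Return {letter: (snake_pair, newt_pair)} with the snake pair varying slowest."""
    combos = product(pairs, pairs)  # outer loop = snake, inner loop = newt
    return {letter: combo for letter, combo in zip(ascii_uppercase, combos)}


ga1_groups = label_groups(GA1_PAIRS)
for letter, (snake, newt) in ga1_groups.items():
    print(f"Simulation {letter}: snake {snake}, newt {newt}")
```

Crossing four pairs with themselves gives the 16 groups A through P.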
Correlation values were collected every 10th generation with the exception of the 1st generation (calculations started at generation 2). The first figure presented is the correlation calculations for simulation A up to 5,000 generations with a 5 by 5 grid. The red line represents calculation A (CA), which is the correlation between mean newt phenotype and mean snake phenotype (I predicted this would be positive). The green line (CB) is the correlation between the number of newts and the number of snakes (I predicted this would be negative). The blue line (CC) represents the correlation between the sum of toxicity and the number of snakes (also predicted to be negative). The purple line (CD) is the correlation between the amount of resistance and the number of newts (predicted to be negative). The second figure shows the correlation calculations for all 16 simulations. The third figure shows the mean newt phenotype (red), the mean snake phenotype (blue), and the difference between the mean snake phenotype and the mean newt phenotype (black). I want to compare the third figure, which is a whole-population calculation, to the second figure, which has a more localized calculation.
## [1] "CA = mean_newt_pheno_By_mean_snake_pheno"
## [1] "CB = num_newts_By_num_snakes"
## [1] "CC = sum_newt_pheno_By_num_snake"
## [1] "CD = sum_snake_pheno_By_num_newt"
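To make the four calculations concrete, here is a rough Python sketch of how CA, CB, CC, and CD could be computed from individual positions and phenotypes on an n-by-n grid. This is an illustration under assumptions, not the simulation's own code: it assumes a square 35 by 35 area with equal-sized boxes, uses plain counts rather than per-unit-area values, and skips empty boxes when a mean phenotype is needed. The function names are hypothetical.

```python
import numpy as np


def grid_stats(x, y, pheno, n, width=35.0):
    """Per-box count, phenotype sum, and phenotype mean on an n-by-n grid."""
    edges = np.linspace(0.0, width, n + 1)
    count, _, _ = np.histogram2d(x, y, bins=[edges, edges])
    total, _, _ = np.histogram2d(x, y, bins=[edges, edges], weights=pheno)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = total / count  # NaN where a box holds no individuals
    return count.ravel(), total.ravel(), mean.ravel()


def pearson(a, b):
    """Pearson correlation over boxes where both values are defined."""
    ok = np.isfinite(a) & np.isfinite(b)
    if ok.sum() < 3 or np.std(a[ok]) == 0 or np.std(b[ok]) == 0:
        return float("nan")  # too few boxes or no variation
    return float(np.corrcoef(a[ok], b[ok])[0, 1])


def four_correlations(newts, snakes, n, width=35.0):
    """newts and snakes are (x, y, phenotype) tuples of equal-length arrays."""
    n_count, n_sum, n_mean = grid_stats(*newts, n, width)
    s_count, s_sum, s_mean = grid_stats(*snakes, n, width)
    return {
        "CA": pearson(n_mean, s_mean),    # mean_newt_pheno_By_mean_snake_pheno
        "CB": pearson(n_count, s_count),  # num_newts_By_num_snakes
        "CC": pearson(n_sum, s_count),    # sum_newt_pheno_By_num_snake
        "CD": pearson(s_sum, n_count),    # sum_snake_pheno_By_num_newt
    }


# Made-up individuals, only to show the call; not simulation output.
rng = np.random.default_rng(1)
newts = (rng.uniform(0, 35, 400), rng.uniform(0, 35, 400), rng.normal(1.0, 0.1, 400))
snakes = (rng.uniform(0, 35, 300), rng.uniform(0, 35, 300), rng.normal(1.0, 0.1, 300))
print(four_correlations(newts, snakes, n=5))
```

Each correlation is taken across the n times n boxes, so a 5 by 5 grid gives at most 25 data points per calculation.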
Looking at the figures, can I really tell what is going on?
In the first figure of the correlation calculations in simulation A, my answer would be “no”. I have some concerns. My first concern is why CA starts out as a perfect correlation. The second concern is the variability between measurements. The points seem to jump around a lot, sometimes going from negative to positive in 10 or so generations. These correlation values look like they are running through a cycle. I do not know if this is concerning, but the blue, green, and purple lines are practically on top of each other. Is this due to both newts and snakes having the same GA, or do the newt quantities (population size or phenotype) simply follow similar trends to the snake quantities?
In the second figure, my answer would be “maybe”. There is definitely a difference in the correlation values among these simulations, seen especially in the red line. Sometimes the red line is very high (correlation of 1) and sometimes it is lower or even negative. For the most part the green, blue, and purple lines are close to each other. However, these three lines do differ in some of the simulations, particularly when one of the species has a larger effect size.
In the third figure, I would say there is something going on, but is it coevolution? “IDK”. I have plotted this figure before and talked about its implications in Comparing GA. But now I want to combine what we see here with the figure above it. Let’s first look at what the correlation lines are doing when the mean phenotypes of newts and snakes are similar/close or cross (A, B, D, E, F, G, M, N). In about half of these there is a high correlation between newt and snake phenotype, but in the other half the correlation is low or a bit negative. If we look at the correlations for when the mean newt and snake phenotypes are further apart (C, H, I, J, K, L, O, P), the correlation between mean newt phenotype and mean snake phenotype is smaller. However, there might be an increase in the mean newt and mean snake correlation when the newt phenotype is (significantly) higher than the snake phenotype (C, O, P). I wonder if there is any avoidance occurring? Simulation N is a particularly interesting case to look at.
This subsection looks at the above figures, but zoomed in: I only look at the first 200 generations. The goal is to see where the correlation calculations begin and how they change on a smaller, easier-to-see time scale.
## [1] "CA = mean_newt_pheno_By_mean_snake_pheno"
## [1] "CB = num_newts_By_num_snakes"
## [1] "CC = sum_newt_pheno_By_num_snake"
## [1] "CD = sum_snake_pheno_By_num_newt"
Here the four correlations are calculated and collected every 10 generations. The mean phenotypes of newts and snakes are collected every 20 generations. The difference in collection times might lead to more smoothness in one plot vs. the other (in this case I don’t think that is happening). In the first graph I noticed that the red line (correlation of mean newt phenotype with mean snake phenotype) starts out very positively correlated in some of the simulations. Is this high positive correlation normal or random? When looking at both figures, it’s weird how different the correlations look when it seems like the mean phenotypes of newts and snakes are unchanging. Looking at the correlation plot, I wonder what is occurring to create some positive and negative correlations, especially for CB, CC, and CD, because nothing seems dramatic in the total population phenotype mean. What other information might be useful?
In the next section I run the GA1 simulations for 20,000 generations and 100,000 generations, to see if there are any major correlation changes that might occur when running my simulation for longer.
## [1] "CA = mean_newt_pheno_By_mean_snake_pheno"
## [1] "CB = num_newts_By_num_snakes"
## [1] "CC = sum_newt_pheno_By_num_snake"
## [1] "CD = sum_snake_pheno_By_num_newt"
The 20,000 generation plots look very similar to the 5,000 generation plots. It does not look like anything changes dramatically.
## [1] "CA = mean_newt_pheno_By_mean_snake_pheno"
## [1] "CB = num_newts_By_num_snakes"
## [1] "CC = sum_newt_pheno_By_num_snake"
## [1] "CD = sum_snake_pheno_By_num_newt"
Wow, it is extremely hard to tell what is going on when you plot all 100,000 generations at the same time! Something is going on that differentiates these simulations from each other. Some of the simulations have a lower CA or higher CD. Overall, it does not look like increasing the number of generations has an effect on the correlation values (after a certain number of generations nothing really changes, but maybe it cycles). It is difficult to determine what is going on in the simulation by looking at correlations alone. Take F and H from the 100,000 generation simulations as an example. Both F and H have a higher CA throughout the simulation, and CB, CC, and CD are all about the same, but the average mean phenotype for the simulation looks very different. Maybe looking at the variance between the peaks and valleys of different simulations will give me a better idea of what is happening?
Here I am going to condense the data down to 1 box plot per calculation per simulation over the last 1,000 generations (of the 5,000, 20,000, and 100,000 generation simulations). This will hopefully be an easier plot to read and make it easier to compare results between different simulation lengths. There will be three plots. The first plot (top left) will be the results for the last 1,000 generations of the 5,000 generation simulation. The second plot (top right) will be the results for the last 1,000 generations of the 20,000 generation simulation. The third plot (bottom left) will be the results for the last 1,000 generations of the 100,000 generation simulation. All plots will show the 16 genetic architecture experiments (in this case they are unlabeled due to space, but they start with A and continue left to right, then down to the next line). The x-axis and y-axis are also unlabeled (due to space), but represent which correlation was calculated and its value, respectively. The four correlation calculations are printed below:
## [1] "CA = mean_newt_pheno_By_mean_snake_pheno"
## [1] "CB = num_newts_By_num_snakes"
## [1] "CC = sum_newt_pheno_By_num_snake"
## [1] "CD = sum_snake_pheno_By_num_newt"
## [1] "Difference in simulation length"
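The condensing step can be sketched as follows: keep only the measurements from the last 1,000 generations of one correlation series and reduce them to the numbers a box plot is built from. This is a hedged Python illustration, not the plotting code used for the figures; `last_window_summary` is a hypothetical helper, and it reports the plain minimum and maximum rather than box plot whiskers.

```python
import numpy as np


def last_window_summary(generations, values, window=1000):
    """Summarize one correlation series over its final `window` generations."""
    generations = np.asarray(generations)
    values = np.asarray(values, dtype=float)
    keep = generations > generations.max() - window  # last `window` generations
    v = values[keep]
    v = v[np.isfinite(v)]  # drop undefined correlations
    q1, median, q3 = np.percentile(v, [25, 50, 75])
    return {"n": int(v.size), "min": float(v.min()), "q1": float(q1),
            "median": float(median), "q3": float(q3), "max": float(v.max())}


# Made-up series sampled every 10 generations, only to show the call.
gens = np.arange(10, 5001, 10)
vals = np.sin(gens / 50.0) * 0.3
print(last_window_summary(gens, vals))
```

With measurements every 10 generations, each box in the plots summarizes about 100 correlation values, regardless of whether the full run was 5,000, 20,000, or 100,000 generations long.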
The box plots differ among the 16 simulations, but are relatively similar between the different simulation lengths (when only looking at 1,000 generations). I was surprised that the box plots had similar lengths between the different simulation times. The correlation between newt phenotype and snake phenotype (CA, red) is mostly positive, but can be negative or zero. The correlation between the number of newts (newt density) and the number of snakes (snake density) (CB, green), the correlation between the amount of toxin and the number of snakes (CC, blue), and the correlation between the amount of resistance and the number of newts (CD, purple) are mostly negative. These negative values do not go below -0.5.
I re-ran the 5,000 generation simulation, but changed the grid size, decreasing the area over which each correlation data point is calculated. In this case I divided a 35 by 35 area into 8 by 8, 10 by 10, and 20 by 20 boxes. As the number of boxes increases, the area of each box decreases. The goal is to get close to finding a good amount of area for calculating local adaptation, one that is neither too large nor too small. I will compare the results of the following simulations. Note: I ran the 16 slim simulations 4 times (setting the grid at 5, 8, 10, and 20) with the same msprime files.
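As a quick worked example of how box size changes with the grid setting, the Python snippet below computes the number of boxes, the side length, and the area of each box. It assumes the 35 by 35 area is split into equal square boxes, and that the 5 by 5 grid uses the same area; `cell_geometry` is a name made up for this sketch.

```python
WIDTH = 35.0  # side length of the simulated area, taken from the text above


def cell_geometry(n, width=WIDTH):
    """Number of boxes, box side length, and box area for an n-by-n grid."""
    side = width / n
    return {"cells": n * n, "side": side, "area": side * side}


for n in (5, 8, 10, 20):
    g = cell_geometry(n)
    print(f"{n:>2} by {n:<2}: {g['cells']:>3} boxes, side {g['side']:.3f}, area {g['area']:.4f}")
```

Going from a 5 by 5 to a 20 by 20 grid shrinks each box from 49 to 3.0625 square units (a factor of 16), while the number of boxes entering each correlation grows from 25 to 400.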
## [1] "CA = mean_newt_pheno_By_mean_snake_pheno"
## [1] "CB = num_newts_By_num_snakes"
## [1] "CC = sum_newt_pheno_By_num_snake"
## [1] "CD = sum_snake_pheno_By_num_newt"
When I increased the grid to 8 by 8, the correlation values got smaller in magnitude. The four correlations were weaker and did not fluctuate between measurements as much as they did when the grid was 5 by 5.
## [1] "CA = mean_newt_pheno_By_mean_snake_pheno"
## [1] "CB = num_newts_By_num_snakes"
## [1] "CC = sum_newt_pheno_By_num_snake"
## [1] "CD = sum_snake_pheno_By_num_newt"
When I increased the grid to 10 by 10, the correlation values looked very similar to the 8 by 8 grid. The 10 by 10 grid had less correlated values than the 5 by 5 grid, and the correlations did not fluctuate between measurements as much as they did when the grid was 5 by 5. It is also possible that the 10 by 10 grid was just a little bit less correlated than the 8 by 8, but it is a bit difficult to tell.
## [1] "CA = mean_newt_pheno_By_mean_snake_pheno"
## [1] "CB = num_newts_By_num_snakes"
## [1] "CC = sum_newt_pheno_By_num_snake"
## [1] "CD = sum_snake_pheno_By_num_newt"
When I changed the grid to 20 by 20, everything was negatively correlated and the correlations did not jump around.
Similar to what I did before, I made box plots of the last 1,000 generations of the 5,000 generation simulations. Here, I will visually compare how changing the amount of area where the correlation calculations took place affects the correlation output. In the top left will be the box plots of the correlations from the 5 by 5 grid; to the right is the 8 by 8 grid. Under the 5 by 5 grid will be the 10 by 10 grid, and next to it will be the 20 by 20 grid box plots. To save space and see the results, there are no labels for the x-axis (correlation calculation), the y-axis (correlation values), or the simulation letter (A starts on the top left and continues right and then to the next line). The four correlation calculations are printed below:
## [1] "CA = mean_newt_pheno_By_mean_snake_pheno"
## [1] "CB = num_newts_By_num_snakes"
## [1] "CC = sum_newt_pheno_By_num_snake"
## [1] "CD = sum_snake_pheno_By_num_newt"
## [1] "Difference in amount of grids"
As the number of grid boxes increases, the area that each box covers shrinks, and with it the number of individuals (phenotype or count) in each box. This shrinkage can produce boxes with few individuals, which could have an effect on the correlation calculations. By decreasing the size of the measured area, the variance between data collection points decreases (easily spotted in the red box plot, newt phenotype by snake phenotype). When I increased the grid number, there was less correlation in both the positive and negative directions. When the grid value was very high (20), everything became centered around 0. What is the best method for choosing the grid size? Is the correlation between the things I measure proof of co-evolution (especially when some of the values started off at 1)?
Now, I am going to look at the correlations that I calculated for the GA2 experiment. In this experiment the variance for mutations is equal between the species (mu*(effect_size^2)). Below are the simulation letters and their associated mutation rates and effect sizes.
## [1] "Simulation A: Snake mu-rate & effect sd (1.0e-08, 0.05) Newt mu-rate & effect sd (1.0e-08, 0.05)"
## [1] "Simulation B: Snake mu-rate & effect sd (1.0e-08, 0.05) Newt mu-rate & effect sd (1.0e-09, 0.158114)"
## [1] "Simulation C: Snake mu-rate & effect sd (1.0e-08, 0.05) Newt mu-rate & effect sd (1.0e-10, 0.5)"
## [1] "Simulation D: Snake mu-rate & effect sd (1.0e-08, 0.05) Newt mu-rate & effect sd (1.0e-11, 1.58114)"
## [1] "Simulation E: Snake mu-rate & effect sd (1.0e-08, 0.05) Newt mu-rate & effect sd (1.0e-12, 5.0)"
## [1] "Simulation F: Snake mu-rate & effect sd (1.0e-09, 0.158114) Newt mu-rate & effect sd (1.0e-08, 0.05)"
## [1] "Simulation G: Snake mu-rate & effect sd (1.0e-09, 0.158114) Newt mu-rate & effect sd (1.0e-09, 0.158114)"
## [1] "Simulation H: Snake mu-rate & effect sd (1.0e-09, 0.158114) Newt mu-rate & effect sd (1.0e-10, 0.05)"
## [1] "Simulation I: Snake mu-rate & effect sd (1.0e-09, 0.158114) Newt mu-rate & effect sd (1.0e-11, 1.58114)"
## [1] "Simulation J: Snake mu-rate & effect sd (1.0e-09, 0.158114) Newt mu-rate & effect sd (1.0e-12, 5.0)"
## [1] "Simulation K: Snake mu-rate & effect sd (1.0e-10, 0.5) Newt mu-rate & effect sd (1.0e-08, 0.05)"
## [1] "Simulation L: Snake mu-rate & effect sd (1.0e-10, 0.5) Newt mu-rate & effect sd (1.0e-09, 0.158114)"
## [1] "Simulation M: Snake mu-rate & effect sd (1.0e-10, 0.5) Newt mu-rate & effect sd (1.0e-10, 0.5)"
## [1] "Simulation N: Snake mu-rate & effect sd (1.0e-10, 0.5) Newt mu-rate & effect sd (1.0e-11, 1.58114)"
## [1] "Simulation O: Snake mu-rate & effect sd (1.0e-10, 0.5) Newt mu-rate & effect sd (1.0e-12, 5.0)"
## [1] "Simulation P: Snake mu-rate & effect sd (1.0e-11, 1.58114) Newt mu-rate & effect sd (1.0e-08, 0.05)"
## [1] "Simulation Q: Snake mu-rate & effect sd (1.0e-11, 1.58114) Newt mu-rate & effect sd (1.0e-09, 0.158114)"
## [1] "Simulation R: Snake mu-rate & effect sd (1.0e-11, 1.58114) Newt mu-rate & effect sd (1.0e-10, 0.5)"
## [1] "Simulation S: Snake mu-rate & effect sd (1.0e-11, 1.58114) Newt mu-rate & effect sd (1.0e-11, 1.58114)"
## [1] "Simulation T: Snake mu-rate & effect sd (1.0e-11, 1.58114) Newt mu-rate & effect sd (1.0e-12, 5.0)"
## [1] "Simulation U: Snake mu-rate & effect sd (1.0e-12, 5.0) Newt mu-rate & effect sd (1.0e-08, 0.05)"
## [1] "Simulation V: Snake mu-rate & effect sd (1.0e-12, 5.0) Newt mu-rate & effect sd (1.0e-09, 0.158114)"
## [1] "Simulation W: Snake mu-rate & effect sd (1.0e-12, 5.0) Newt mu-rate & effect sd (1.0e-10, 0.5)"
## [1] "Simulation X: Snake mu-rate & effect sd (1.0e-12, 5.0) Newt mu-rate & effect sd (1.0e-11, 1.58114)"
## [1] "Simulation Y: Snake mu-rate & effect sd (1.0e-12, 5.0) Newt mu-rate & effect sd (1.0e-12, 5.0)"
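The equal-variance design of GA2 can be checked with a few lines of arithmetic. The Python sketch below evaluates mu*(effect_size^2) for the five (mutation rate, effect sd) pairs that make up the GA2 grid; the names `GA2_PAIRS` and `mutational_variance` are made up for this illustration.

```python
import math

# The five (mutation rate, effect-size sd) pairs crossed in the GA2 experiment
GA2_PAIRS = [
    (1.0e-08, 0.05),
    (1.0e-09, 0.158114),
    (1.0e-10, 0.5),
    (1.0e-11, 1.58114),
    (1.0e-12, 5.0),
]


def mutational_variance(mu, sd):
    """Per-generation mutational variance, mu * sd^2, as defined in the text."""
    return mu * sd ** 2


for mu, sd in GA2_PAIRS:
    print(f"mu = {mu:.1e}, sd = {sd:<8} -> mu*sd^2 = {mutational_variance(mu, sd):.4e}")

# Every pair gives about 2.5e-11 (up to rounding of the printed sd values).
assert all(math.isclose(mutational_variance(mu, sd), 2.5e-11, rel_tol=1e-4)
           for mu, sd in GA2_PAIRS)
```

Each tenfold drop in mutation rate is offset by multiplying the effect sd by the square root of 10 (about 3.162), which keeps mu*sd^2 constant.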
## [1] "CA = mean_newt_pheno_By_mean_snake_pheno"
## [1] "CB = num_newts_By_num_snakes"
## [1] "CC = sum_newt_pheno_By_num_snake"
## [1] "CD = sum_snake_pheno_By_num_newt"